HETERO-ASSOCIATING COILED-COIL PEPTIDES

The present invention relates to methods for the identification of novel hetero-
associating coiled-coil peptides and uses of these peptides for hetero-dimerization of
fusion proteins. It furthermore relates to vectors and host cells useful for the production of
these novel hetero-association peptides and (poly)peptides/proteins comprising these
peptides.

Increasingly, there is a need for proteins which combine two or more functions, such as
binding to two different specificities, or binding and enzymatic activity, in a single
structure. Typically, proteins which combine two or more functions are prepared either
as fusion proteins or through chemical conjugation of the component functional
domains. Both of these approaches suffer from disadvantages. Genetic "single chain"
fusions suffer the disadvantages that (i) only a few (two to three) proteins can be fused
(1), (ii) mutual interference between the component domains may hinder folding, and (iii)
the size of the fusion protein may make it difficult to prepare. The alternative, chemical
cross-linking in vitro following purification of independently expressed proteins, is difficult
to control and invariably leads to undefined products and to a severe loss in yield of
functional material.

A third approach takes advantage of genetic fusions of functional domains to association
domains which lead to self-association on co-expression in appropriate host cells. To
assemble at least two different fragments fused to association domains, the domains
must have a tendency to form hetero-multimers. In one approach, a natural protein or
protein domain was dissected and fused to protein partners to achieve hetero-
association of the fusion proteins via the reconstitution of the native-like structure of the
dissected protein or protein domain (WO 96/13583).

In principle, hetero-association can be achieved with complementary helices such as the
hetero-dimerizing Jun and Fos zippers of the AP-1 transcription factor (2) or other helical
coiled-coil structures which are involved in the oligomerization of a wide variety of
proteins. Because of their small size and structural regularity, they have also been used
as artificial domains to mediate oligomerization of various proteins (3, 4). For example,
the association of two separately expressed scFv antibody fragments by C-terminally
fused amphipathic helices in vivo provides homo-dimers of antibody fragments in E. coli
(WO 93/15210; 5, 6). Coiled coils consist of two or more amphipathic helices wrapping
around each other with a slight supercoil. They contain a characteristic heptad repeat
(abcdefg)n with a distinct pattern of hydrophobic and hydrophilic residues (Fig. 1A; 7,
8). The positions a and d, which form the hydrophobic interface between the helices,
are usually aliphatic and have a profound effect on the oligomerization state (9, 10). The
positions b, c, e, g, and f are solvent-exposed and usually polar. The positions e and g,
which flank the hydrophobic core, can make interhelical interactions between g(i) and e'(i+1)
residues, and thereby mediate heterospecific pairing (11-14).

As most naturally-occurring coiled coils are homodimeric, synthetic sequences have
been designed to promote specific hetero-oligomerization (11, 13-15). However, it was
observed that designed coiled coils which behave well as synthetic peptides failed in
fusion proteins expressed in E. coli, as they were proteolytically degraded.

Thus, the clear disadvantage of association domains based on hetero-associating
helices is their pseudo-symmetry and their similar periodicity of hydrophobic and
hydrophilic residues. This structural similarity results in a strong tendency to form
homo-dimers and thus significantly lowers the yield of hetero-dimers (2, 16).
Furthermore, the formation of Jun/Fos hetero-dimers is kinetically disfavoured and
requires a temperature-dependent unfolding of the kinetically favoured homo-dimers,
especially Jun/Jun homo-dimers (WO 93/15210; 2, 16). Because of the need for
additional purification steps to separate the unwanted homo-dimers from hetero-dimers
and the resulting decrease in yield, hetero-association domains based on amphipathic
helices have so far not resulted in practical advantages compared to conventional
chemical coupling.



Further, it is not currently possible to predict sequences of coiled coil-forming peptides
that will simultaneously have high stability and heterospecificity as well as advantageous
in vivo properties, such as resistance to proteases. This is crucial to practical
applications of optimal interacting heterodimers for in vivo studies of protein
oligomerization, e.g. the design of bispecific miniantibodies (17).

Thus, the technical problem underlying the present invention is to provide association
domains based on helical coiled-coil structures which lead to hetero-association. The
solution to the above technical problem is achieved by the embodiments characterized
in the claims. Accordingly, the present invention provides a method which allows the
identification of hetero-associating (poly)peptides. This technical approach, i.e. the design of an
appropriate coiled-coil library and screening by using a library-vs-library approach, is
neither provided nor suggested by the prior art.

Thus, the present invention relates to a method for the identification of hetero-
associating (poly)peptides comprising the steps of:

(a) providing a library A of (poly)peptides/proteins comprising (poly)peptides Am
having the general formula:

VAQLXEXVKTLXAXZYELXSXVQRLXEXVAQL

wherein X represents a mixture of E, K, Q, and R, and wherein Z represents a
mixture of N and V;

(b) providing a library B of (poly)peptides/proteins comprising (poly)peptides Bn
having the general formula:

VDELXAXVDQLXDXZYALXTXVAQLXKXVEKL

wherein X represents a mixture of E, K, Q, and R, and wherein Z represents a
mixture of N and V;

(c) combining in a common medium the (poly)peptides/proteins of said libraries A 
and B; and 

(d) screening or selecting for a screenable or selectable property caused by the
hetero-association of a (poly)peptide Am with a (poly)peptide Bn.
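
The general formulae of steps (a) and (b) can be read as sequence templates. The following minimal Python sketch, which is an illustration and not part of the claimed method, tests whether a given 32-residue sequence is a member of library A or library B. The helper names `template_to_regex` and `is_member` and the example sequence are hypothetical; only the two templates and the meaning of X and Z are taken from the text above.

```python
import re

# Templates from steps (a) and (b): X = mixture of E, K, Q, R; Z = mixture of N, V.
LIBRARY_A = "VAQLXEXVKTLXAXZYELXSXVQRLXEXVAQL"
LIBRARY_B = "VDELXAXVDQLXDXZYALXTXVAQLXKXVEKL"

# Hypothetical helper: turn a template into a regular expression.
def template_to_regex(template: str) -> re.Pattern:
    classes = {"X": "[EKQR]", "Z": "[NV]"}
    # Fixed residues are kept literally; X and Z become character classes.
    return re.compile("".join(classes.get(ch, ch) for ch in template))

# Hypothetical helper: True if the whole sequence fits the template.
def is_member(sequence: str, template: str) -> bool:
    return template_to_regex(template).fullmatch(sequence) is not None

# Illustrative library A member (every X set to E, Z set to N); not a selected clone.
example_member = LIBRARY_A.replace("X", "E").replace("Z", "N")
print(is_member(example_member, LIBRARY_A))  # member of library A
print(is_member(example_member, LIBRARY_B))  # not a member of library B
```

A check of this kind may be useful for confirming that sequenced clones carry only the intended residues at the randomized positions.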
The term "(poly)peptide" relates to molecules consisting of one or more chains of
multiple, i.e. two or more, amino acids linked via peptide bonds.

The term "protein" refers to (poly)peptides where at least part of the (poly)peptide has or
is able to acquire a defined three-dimensional arrangement by forming secondary,
tertiary, or quaternary structure, within and/or between its (poly)peptide chain(s). This
definition comprises proteins such as naturally occurring or at least partially artificial
proteins, as well as fragments or domains of whole proteins, as long as these fragments
or domains have a defined three-dimensional arrangement as described above.
In this context, the commonly known one-letter code for amino acid residues is used.
The term "screenable or selectable property" refers to a property which is generated in
the event of a successful interaction taking place during screening or selection.
Examples for screenable or selectable properties include, but are not limited to, binding to
a target or presentation of a target for ligand-binding, enzymatic activity, transactivation
of transcription of a reporter gene such as beta-galactosidase, alkaline phosphatase or
nutritional markers such as his3 and leu, or resistance genes giving resistance to an
antibiotic such as ampicillin, chloramphenicol, kanamycin, zeocin, neomycin, tetracycline
or streptomycin. In another embodiment, the selectable or screenable property can be
restoration of phage infectivity to a filamentous phage rendered non-infectious by
deletion of the N-terminal domain(s) of the geneIII protein (U.S. Patent No. 5,514,548).

In a preferred embodiment, the invention relates to a method wherein said libraries A
and B are provided by providing libraries of nucleic acid sequences encoding said
(poly)peptides/proteins, followed by causing or allowing the expression of said libraries
of (poly)peptides/proteins.

Methods for providing libraries of nucleic acid sequences, and for causing or allowing
their expression, either in vivo after transformation, transfection or transduction of
appropriate host cells, using appropriate vectors containing all elements necessary for
transcription and translation, such as promoter/operator elements etc., or in vitro using in
vitro expression systems, are well known to anyone of ordinary skill in the art (see e.g.
Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 2nd ed.).

Further preferred is a method wherein said common medium are host cells, each cell
harbouring nucleic acid sequences encoding a (poly)peptide/protein of each of said
libraries A and B.

Most preferred is a method wherein said (poly)peptides/proteins of said libraries A and B
further comprise either an N- or a C-terminal fragment of the murine DHFR enzyme, and
wherein said screenable or selectable property is insensitivity of the host cell to
trimethoprim by reconstitution of the DHFR enzyme on hetero-association of
(poly)peptides Am and Bn.

The DHFR assay has been published (WO 98/34120; 19) and is further exemplified in
the examples.

In another embodiment, the present invention relates to a hetero-associating
(poly)peptide Am taken from the list of:

WINZIPA1: VAQLEEKVKTLRAQNYELKSRVQRLREQVAQL
WINZIPA2: VAQLRERVKTLRAQNYELESEVQRLREQVAQL
WINZIPA3: VAQLQEKVKTLRARNYELKSEVQRLEEKVAQL
WINZIPA4: VAQLEEQVKTLQARNYELKSKVQRLKEKVAQL
WINZIPA5: VAQLEERVKTLRAQNYELKSKVQRLEEQVAQL
WINZIPA6: VAQLEEQVKTLEAENYELKSKVQRLRERVAQL
WINZIPA7: VAQLQEQVKTLEAQNYELESEVQRLKEQVAQL
WINZIPA8: VAQLEERVKTLKAENYELESEVQRLKERVAQL
WINZIPA9: VAQLEEKVKTLKAKNYELKSKVQRLKEKVAQL
WINZIPA10: VAQLQEEVKTLQAENYELRSEVQRLEEEVAQL
WINZIPA11: VAQLRERVKTLRARNYELQSKVQRLKERVAQL

Furthermore, the present invention relates to a hetero-associating (poly)peptide Bn taken
from the list of:

WINZIPB1: VDELQAEVDQLQDENYALKTKVAQLRKKVEKL
WINZIPB2: VDELKAEVDQLQD3NYALTTKVAQLRKEVEKL
WINZIPB3: VDELEAEVDQLKD3NYALKTKVAQLQKQVEKL
WINZIPB4: VDELRAKVDQLQDENYALETEVAQLQKRVEKL
WINZIPB5: VDELEAEVDQLEDQNYALQTRVAQLEKRVEKL
WINZIPB6: VDELKAKVDQLKDKNYALRTKVAQLRKKVEKL
WINZIPB7: VDELRAQVDQLQDKNYALRTRVAQLKKRVEKL
WINZIPB8: VDELQAEVDQLQDQNYALRTQVAQLKKKVEKL
WINZIPB9: VDELRAQVDQLEDQNYALSTQVAQLEKEVEKL
WINZIPB10: VDELQAKVDQLKDENYALQTKVAQLQKRVEKL
WINZIPB11: VDELRAEVDQLEDENYALRTRVAQLRKQVEKL

Particularly preferred is the use of a hetero-associating (poly)peptide according to the
present invention for the identification of optimized hetero-associating (poly)peptides in a
method according to the present invention, wherein one of the hetero-associating
peptides WinZipAm as listed hereinabove is used instead of library A of
(poly)peptides/proteins comprising (poly)peptides Am in step (a) above, or wherein a
hetero-associating peptide WinZipBn as listed hereinabove is used instead of library B
of (poly)peptides/proteins comprising (poly)peptides Bn in step (b) above.

In a still further preferred embodiment, the present invention relates to an optimized
hetero-associating (poly)peptide obtainable by the method of the present invention.

In a most preferred embodiment, the present invention relates to a pair of hetero-
associating (poly)peptides taken from the list of:
WinZipA1 and WinZipB1
WinZipA2 and WinZipB1
WinZipA1 and WinZipB2
WinZipA3 and WinZipB3
WinZipA4 and WinZipB4
WinZipA5 and WinZipB5
WinZipA6 and WinZipB6
WinZipA7 and WinZipB7
WinZipA8 and WinZipB8
WinZipA9 and WinZipB9
WinZipA10 and WinZipB10
WinZipA11 and WinZipB11

In a yet further preferred embodiment, the invention relates to a (poly)peptide/protein
comprising one of the hetero-associating (poly)peptides, or an optimized hetero-
associating (poly)peptide of the present invention, and a further (poly)peptide/protein.

In that context, "(poly)peptide/protein comprising one of the hetero-associating
(poly)peptides, or an optimized hetero-associating (poly)peptide of the present invention,
and a further (poly)peptide/protein" refers to all constructs which comprise one of the
hetero-association peptides according to the present invention and additional moieties.
This comprises (poly)peptides/proteins which are expressed from a contiguous nucleic
acid coding sequence. Additionally, this comprises constructs where the individual
components are expressed from different nucleic acid coding sequences, or where the
components are produced by peptide synthesis, and where the separate components
are linked by the formation of disulfide bonds or by chemical conjugation.

Still further preferred is a (poly)peptide/protein wherein said further (poly)peptide/protein
is an enzyme, a toxin, a cytokine, a metal binding domain, a transcription factor, a
member of the immunoglobulin superfamily, a bioactive peptide of 5 to 15 amino acid
residues, a peptide hormone, a growth factor, a lectin, a lipoprotein, a peptide which is
able to bind to an independent binding entity, or a functional fragment of any said further
(poly)peptide/protein.

In a most preferred embodiment, the present invention relates to a hetero-associated
(poly)peptide/protein comprising at least two (poly)peptides/proteins of the present
invention, associated by hetero-association of a hetero-associating (poly)peptide Am and
a hetero-associating (poly)peptide Bn.

The term "hetero-associated (poly)peptide/protein" refers to all bispecific and/or bivalent
complexes formed by taking advantage of the hetero-associating peptides according to
the present invention. These include, for example, constructs where said "further
(poly)peptides/proteins" are two antibody fragments directed against different
specificities. Such bispecific constructs can be used to increase the selectivity of antibody-
based approaches in therapy of diseases where the target cells exhibit a pattern of two
cell-surface markers distinct from that of non-target cells, which may present one of the
two markers. Furthermore, one of the antibody specificities may be directed to a target
cell, whereas the second may be used to target a drug carrier moiety selectively to the
target. Additionally, antibody fragments as targeting vehicles may be combined with
(poly)peptides/proteins which serve as effector domains, such as enzymes or signalling
molecules.

Particularly preferred is a DNA sequence encoding a hetero-associating (poly)peptide
taken from the list of WinZipA1 to WinZipA11 and WinZipB1 to WinZipB11, or
encoding an optimized hetero-associating (poly)peptide or a (poly)peptide/protein of the
present invention.

Further preferred is a DNA sequence encoding a hetero-associating (poly)peptide
wherein said DNA sequence hybridizes under stringent conditions to a DNA sequence
encoding a hetero-associating (poly)peptide taken from the list of WinZipA1 to
WinZipA11 and WinZipB1 to WinZipB11.

As used herein, the term "hybridizes under stringent conditions" is intended to describe
conditions for hybridization and washing under which nucleotide sequences at least 60%
homologous to each other typically remain hybridized to each other. Preferably, the
conditions are such that sequences at least 65%, more preferably at least 70%,
and even more preferably at least 75% homologous to each other typically remain
hybridized to each other. Such stringent conditions are known to those skilled in the art
and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, New
York (1989), 6.3.1-6.3.6. A preferred, non-limiting example of stringent hybridization
conditions is hybridization in 6x sodium chloride/sodium citrate (SSC) at about 45°C,
followed by one or more washes in 0.2x SSC, 0.1% SDS at 50-65°C.

In still another embodiment the invention relates to a vector comprising a DNA sequence 
according to the invention. 

In a further embodiment, the invention relates to a vector comprising DNA sequences
encoding at least two (poly)peptides, comprising at least a hetero-associating
(poly)peptide Am and a hetero-associating (poly)peptide Bn.

Vectors which can be used in accordance with the present invention are well-known to
the practitioner in the art.

In another embodiment, the invention relates to a host cell containing at least one vector
of the present invention.

In a further preferred embodiment the host cell is a mammalian, preferably human cell, a
yeast cell, an insect cell, a plant cell, or a bacterial, preferably E. coli cell.

In a highly preferred embodiment the invention relates to a method for the production of
a hetero-associating (poly)peptide, an optimized hetero-associating (poly)peptide, a
(poly)peptide/protein, or a hetero-associated (poly)peptide/protein of the present
invention, which comprises culturing the host cell of the present invention in a suitable
medium, and recovering said (poly)peptide or said (poly)peptide/protein produced by
said host cell.

In a most preferred embodiment, the invention relates to a pharmaceutical composition
comprising the hetero-associated (poly)peptide/protein of the present invention.

Still further preferred is a diagnostic composition comprising the hetero-associated
(poly)peptide/protein of the present invention.

Further preferred is a kit containing at least one of

a hetero-associating (poly)peptide, an optimized hetero-associating (poly)peptide, or a
(poly)peptide/protein of claims, or a hetero-associated (poly)peptide/protein of the
present invention; or

a vector of claims according to the invention.

FIGURE CAPTIONS

Figure 1: (A) Schematic representation of a parallel dimeric coiled coil. Hatched bars
indicate the possible interhelical interactions between e and g positions. (B) Schematic
representation of the protein complementation assay as described in the Examples.
Introduction of a mutation at the DHFR interface (I114A) was used to increase selection
stringency (19). (C) Overview of the library design depicted as a helical wheel plot from
the N- to the C-terminus (inside to outside). The randomized positions are indexed (*:
equimolar mixture of Q, E, K, R; x: equimolar mixture of N, V). The selected residues
from the predominant pair, WinZip-A1B1 (clone #1, Fig. 1D), are next to the randomized
positions in the respective box. (D) Partial sequences (randomized positions) of clones
from the highest stringency selection which survived at least until passage 10 (clone #1:
WinZip-A1B1, clones #2 to #10: WinZip-A3B3 to WinZip-A11B11). Clone #1 (named
WinZip-A1B1) was identified 18/22 times in passage 12 and 4/11 times in passage 10
(the full amino acid sequences are shown in Fig. 11).

Figure 2: (A) DNA constructs code for fusions between library proteins (shown as α-
helical leucine zippers) and either fragment of murine DHFR (mDHFR). Fusions were
created using either the wild-type or the mutant mDHFR fragment 2 (Ile114Ala), yielding
LibA-DHFR[1] and LibB-DHFR[2] or LibB-DHFR[2:I114A], respectively. (B) Principle of
the mDHFR-fragment complementation assay: E. coli cells are cotransformed with both
fusion libraries in minimal medium, in the presence of IPTG (for induction of expression)
and trimethoprim (for inhibition of the bacterial DHFR). If the library proteins
heterodimerize, mDHFR can fold from the individual fragments, resulting in active
enzyme and bacterial growth. Both mDHFR fragments must be present, and
dimerization of the fused proteins is essential, in order for cell propagation to be
possible. No growth is observed if any of these conditions is not fulfilled (19). The
surviving colonies are the result of "single-step selection" and can be directly analyzed
by DNA sequencing. (C) "Competition selection" is undertaken by pooling colonies from
(B) in selective, liquid culture (passage 0 or P0), propagating the cells and diluting into
fresh selective medium for further passages. An aliquot can be plated and the resulting
colonies analyzed by DNA sequencing.

Figure 3: Competition selection and chain shuffling. (A) Approximately 1.42 x 10^4 clones
resulting from single-step, I114A-mutant selection were pooled (=P0) and competition
selection was undertaken as described in Figure 2C, and in the Examples. At each
passage, some cells were plated and colony sizes were quantitated. (B) Quantitation of
the colony sizes from (A). For comparative purposes, quantitation of colony sizes of cells
transformed with DNA of WinZip-A1B1 (but not passaged in liquid culture) is shown. (C)
Quantitation of the colony sizes from passages of the chain shuffling experiment:
WinZip-B1-DHFR[2:I114A] + LibA-DHFR[1]. In (B) and (C) the numbers of colonies were
normalized such that passages could be directly compared.

Figure 4: (A) Schematic representation of a leucine zipper pair visualized from the N-
terminus illustrating e/g-interactions and the hydrophobic core formed by the a- and d-
positions. (B) Distribution of residues at the semi-randomized positions throughout
selection. The number of zipper pairs sequenced is given in parentheses, save "Before
selection" where the theoretical distribution is reported. Each pair carries one core a-pair
and 6 e/g-pairs. Neutral e/g-pairs have one or both residues as Gln. In "Competition
(I114A)" only clones from P6 to P12 (not from earlier passages) were considered for
analysis. Thus, 37 individual clones were sequenced, and most of the resulting
sequences were identical in two or more clones. The distributions were calculated
according to the frequency of sequence occurrence (n=37). (C) Leucine zipper
sequences obtained after competition selection and chain shuffling. The heptad
positions (a to g) are followed by the heptad number (1 to 5). Invariant residues from
GCN4 are underlined. Clear boxes indicate the semi-randomized e- and g-positions and
core a-position (a3). Circled residues were designed to contribute to helix capping.
Shaded residues were designed for the introduction of restriction sites. Other residues
are from c-Jun (LibA) or c-Fos (LibB). Arrows indicate putative e/g-interactions.
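
The count of 6 e/g-pairs per zipper pair stated in Figure 4B can be reproduced with a short Python sketch. It rests on two assumptions that are not spelled out in the caption: the first residue of each 32-residue helix occupies position a of heptad 1, and a g-residue of heptad i on one helix pairs with the e-residue of heptad i+1 on the partner helix. The function name `eg_pairs` is hypothetical.

```python
# Library templates (X = randomized e/g position, Z = randomized core a-position).
LIBRARY_A = "VAQLXEXVKTLXAXZYELXSXVQRLXEXVAQL"
LIBRARY_B = "VDELXAXVDQLXDXZYALXTXVAQLXKXVEKL"

def eg_pairs(helix_a: str, helix_b: str) -> list:
    """List putative (g, e') residue pairs in both directions.

    Assumption: index 0 is heptad position a, so g-residues sit at
    indices 6, 13, 20, ... and the e'-residue of the following heptad
    lies 5 residues further along the partner helix.
    """
    def one_direction(g_helix: str, e_helix: str) -> list:
        pairs = []
        for g_index in range(6, len(g_helix), 7):
            e_index = g_index + 5
            if e_index < len(e_helix):  # the last g has no partner e'
                pairs.append((g_helix[g_index], e_helix[e_index]))
        return pairs

    return one_direction(helix_a, helix_b) + one_direction(helix_b, helix_a)

pairs = eg_pairs(LIBRARY_A, LIBRARY_B)
print(len(pairs))  # number of putative e/g-pairs per zipper pair
```

Under these assumptions, each direction contributes three pairs, since the g-residue of the fourth heptad has no e' partner within the 32-residue helix.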

Figure 5: Efficiency of competition in a model selection. The selection was set up by
mixing known numbers of cells expressing either GCN4-DHFR[1]/GCN4-DHFR[2:I114A]
fusions or one of 7 LibA-DHFR[1]/LibB-DHFR[2:I114A] pairs previously selected by
single-step selection. The starting ratio was 2.9 x 10^4 : 1 (GCN4 to Lib). Competition
selection was undertaken as described in Figure 2C, and in the Examples. The
appearance of the library pairs in the pool was monitored by restriction analysis. A PvuII
fragment (1138 bp) is unique to the LibB sequence of the LibB-DHFR[2] plasmid, while
another (762 bp) is from pRep4 (repressor plasmid) and remains approximately
constant. The bands were quantitated using the NIH Image gel analysis function to
calculate the ratio of LibB/pRep4 (indicated below each lane).

Figure 6: (A) Deviation of observed e/g-interactions in the selected heterodimer and the
two putative homodimers from the statistically expected distribution in the absence of
selection. Interactions are grouped in potentially attractive (E-K, K-E, E-R, R-E; left black
bars), neutral (Q-Q, Q-E, E-Q, Q-K, K-Q, Q-R, R-Q; grey bars) and repulsive (E-E, K-K,
K-R, R-K, R-R; right black bars). (i) Low stringency selection: clones were subdivided
into those with an N-N pair in the core a-position (n=8) and those with an N-V or V-V pair
(n=6), (ii) medium stringency selection: only clones with an Asn-pair in the core a-
position were considered (n=23), and (iii) highest stringency selection: clones which
survived the competition selection at least up to passage 10 (Fig. 1D) were considered.
These were analyzed counting each sequence once (not weighted, n=10) or according
to their frequency of appearance (weighted, n=37). (B) Number of selected pairs having
a certain difference of attractive (grey bars) or repulsive interactions (black bars),
respectively, between the heterodimer and its constituting homodimers. The definition of
(i), (ii) and (iii) is as in (A).
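
The grouping of e/g-interactions used in Figure 6A, and the distribution expected without selection, can be sketched in Python as follows. The sketch assumes that both positions of a pair are drawn independently from an equimolar mixture of E, K, Q and R, as described for the library design; the function names `classify_pair` and `expected_distribution` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Groups as listed in the caption of Figure 6A.
ATTRACTIVE = {("E", "K"), ("K", "E"), ("E", "R"), ("R", "E")}
REPULSIVE = {("E", "E"), ("K", "K"), ("K", "R"), ("R", "K"), ("R", "R")}

def classify_pair(g: str, e: str) -> str:
    """Classify one g/e' pair as attractive, neutral or repulsive."""
    pair = (g, e)
    if pair in ATTRACTIVE:
        return "attractive"
    if pair in REPULSIVE:
        return "repulsive"
    if "Q" in pair:  # neutral pairs have one or both residues as Gln
        return "neutral"
    raise ValueError(f"residues outside the E/K/Q/R mixture: {pair}")

def expected_distribution(residues: str = "EKQR") -> dict:
    """Expected class fractions for independent, equimolar residues."""
    counts = Counter(classify_pair(g, e) for g, e in product(residues, repeat=2))
    total = sum(counts.values())
    return {label: Fraction(n, total) for label, n in counts.items()}

print(expected_distribution())
```

Of the 16 possible ordered pairs, 4 are attractive, 7 neutral and 5 repulsive, so the expected fractions in the absence of selection are 1/4, 7/16 and 5/16, respectively.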

Figure 7: Positional distribution of amino acids at each e- and g-position in sequences
obtained from the highest stringency selection. The statistically expected random
occurrence of each amino acid at each position was subtracted from the relative
occurrence observed in the selection (Q left (black), E middle (grey), K/R right (black)).

Figure 8: (A) Determination of the molecular weight of WinZip-A1B1 by sedimentation
equilibrium. The upper panel shows the residuals between measured data obtained at
10 µM peptide concentration at 25°C and data fitted as monomer (top), dimer (middle),
or trimer (bottom). The lower panel shows the residuals for a dimeric fit to the data set
obtained at 150 µM peptide concentration, 10°C. (B) Native gel electrophoresis. 1:
WinZip-A1 (pI 11.2); 2: WinZip-B1 (pI 4.9); 3: WinZip-A1B1. (C) Urea titration of the
heterodimer WinZip-A1B1 (■), and the homodimers WinZip-A1 (▲) and WinZip-B1 (▼).

Figure 9: CD-measurements of the synthesized peptides of WinZip-A1B1. (A)
Temperature dependence of [θ]222 for WinZip-A1B1 (■), WinZip-A1 (▲), WinZip-B1 (▼),
and the calculated average of both homodimers (- - -). (B) Dependence of Tm and ΔTm
(-◇-) on pH, and (C) on salt. It must be noted that thermal denaturation of WinZip-A1
was not completely reversible at 1 M salt concentration, and only 71% of the starting
signal was regained.

Figure 10: Sequencing profile of pools from passages of the chain shuffling WinZip-B1-
DHFR[2:I114A] + LibA-DHFR[1]. Representative semi-randomized positions (see Fig. 4)
were taken from a single competition experiment, such that the selection rates can be
directly compared. The ratio of the individual triplet codons (central three nucleotides of
each frame) was visually estimated (CAG = Gln; GAG = Glu; AAG = Lys; CGT = Arg; the
equimolar random mix of the 4 codons results in the predominance of C at the first
position, A at the second and G at the third). Mixed positions are marked by (MNN),
positions where a single codon is dominant (> 50%) are marked in lower case and those
where the codon is clear (> 90%) are marked in upper case. For passages 0, 2 and 8,
two independent sequencing reactions were performed, which yielded identical results.
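
The statement that an equimolar mix of the four codons shows C, A and G as the predominant bases at the first, second and third positions can be verified with a small Python sketch. It assumes the codon assignment CAG = Gln, GAG = Glu, AAG = Lys and CGT = Arg; the function names are hypothetical.

```python
from collections import Counter

# Assumed codon assignment for the semi-randomized e- and g-positions.
CODONS = {"CAG": "Gln", "GAG": "Glu", "AAG": "Lys", "CGT": "Arg"}

def positional_base_frequencies(codons) -> list:
    """Return, for each of the 3 codon positions, the base frequencies."""
    codons = list(codons)
    n = len(codons)
    return [
        {base: count / n for base, count in Counter(c[pos] for c in codons).items()}
        for pos in range(3)
    ]

def predominant_bases(codons) -> str:
    """Most frequent base at each codon position, as a 3-letter string."""
    return "".join(max(freq, key=freq.get)
                   for freq in positional_base_frequencies(codons))

for pos, freq in enumerate(positional_base_frequencies(CODONS), start=1):
    print(pos, freq)
print(predominant_bases(CODONS))
```

With this assignment, C accounts for half of the signal at the first position, while A and G each account for three quarters at the second and third positions.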

Figure 11: (A) Sequences of (poly)peptides WinZipA1 to WinZipA11

(B) Sequences of (poly)peptides WinZipB1 to WinZipB11

The examples illustrate the invention:



EXAMPLES 

All reagents used were of the highest available purity. Sequencing was carried out either
by cycle sequencing with fluorescence labeling (MWG-Biotech, Ebersberg, Germany)
using a LiCor detection system or by automated sequencing with an ABI sequencer.
Restriction endonucleases and DNA modifying enzymes were from Pharmacia and New
England Biolabs. E. coli strain XL1-Blue (Stratagene) was used for subcloning and
propagation of the libraries. E. coli strain BL21 harboring the lacIq plasmid pRep4
(Qiagen) was cotransformed with the appropriate DNA constructs for the survival
assays.

Abbreviations: CD, circular dichroism; mDHFR, murine dihydrofolate reductase;
WinZip, dominant zipper pairs obtained from competition selection; WinZip-A1B1,
original pair selected, comprising peptide A1 from library A and peptide B1 from library B;
WinZip-A1B2 and WinZip-A2B1, optimized pairs comprising the original partner A1 or B1
and the new partner B2 or A2, respectively.



Example 1: Selection of hetero-association peptides (see WO 98/34120, Example 7).

1.1 Introduction

Here we present a strategy for library-vs-library screening in intact cells based on the
folding of the murine enzyme dihydrofolate reductase (mDHFR) from complementary
fragments (18, 19). DHFR was genetically dissected into two rationally designed
fragments, each of which can be fused to a library of proteins or peptides (Fig. 2A).
Members of one library which heterodimerize with a member of the other library drive
the reassembly of the mDHFR fragments, resulting in reconstitution of enzymatic activity
(Fig. 2B). Activity is detected in vivo using an E. coli-based selection assay where the
bacterial DHFR is specifically inhibited with trimethoprim, preventing biosynthesis of
purines, thymidylate, methionine and pantothenate, and therefore cell division. The
reconstituted mDHFR, which is insensitive to the low trimethoprim concentration present
in selection, restores the biosynthetic reactions required for bacterial propagation. As a
result, the interaction between library partners is directly linked to cell survival and
detected by colony formation. We have previously demonstrated the utility of this
strategy with GCN4 leucine zipper-forming peptides, as well as with larger
heterodimerizing partner proteins (19) with Kd values ranging between 3 and 160 nM (20, 21),
although the affinity limits have not been determined.

1.2 Constructs for DHFR fragment complementation 

The DNA constructs encoding the N-terminal (1-107) and C-terminal (108-186) mDHFR
fragments have been previously described (19). The vectors are variants of plasmids Z-
F[1,2], encoding the N-terminal DHFR fragment, and Z-F[3] or Z-F[3:I114A],
respectively, encoding the C-terminal DHFR fragment without or with the I114A mutation
(19). Briefly, each fragment was amplified by PCR with appropriate unique flanking
restriction sites and subcloned into a bacterial expression vector (pQE-32 from Qiagen).
Each plasmid encodes an N-terminal hexahistidine tag, followed by a designed flexible
linker and the appropriate DHFR fragment. Unique restriction sites between the
hexahistidine tag and the flexible linker allow subcloning of the desired library.

1.3 Library design 

Our goal was to select for metabolically stable dimeric coiled coils with high
heterospecificity. Thus, two libraries were designed to meet the requirements of genetic
diversity to prevent recombination, high helix stability and a high probability of
complementarity, all within a reasonable library size (Fig. 1C).

As templates for the outer, solvent-exposed residues (positions b, c, f) we chose the
leucine-zipper regions of the proto-oncogenes c-Jun and c-Fos for library A and B,
respectively, thus minimizing potential recombination despite the repetitive pattern of the
heptads. Indeed, no recombination was found in any of the 80 clones sequenced. We
chose a helix length of 4.5 heptads as a good compromise between stability and size. For
the hydrophobic core residues (positions a, d) we chose the residues of the parallel,
homodimeric leucine zipper GCN4 (Val at a, Leu at d). A single a-position in the middle
of each helix is often occupied by a polar residue, most often an Asn, which forms a
hydrogen bond inside the hydrophobic core (22, 23). Replacement of this Asn pair by a
non-polar one increases the stability significantly, but leads to helices packing in
different registers and orientations, as well as forming higher order oligomers (24-26).
Since we could not ascertain a priori whether higher specificity or stability would be more
advantageous, we included both by allowing Asn and Val at the core a-position with
equal probability. A difficult problem in library design is to encode only the desired amino
acids with a predetermined ratio. We solved this problem by using defined trinucleotide
mixtures in the oligonucleotides, where each trinucleotide codes for one specific amino
acid (27).

The solvent-accessible residues at the e- and g-positions can form interhelical salt
bridges or hydrogen bonds which can contribute to stability and heteromeric specificity
(13, 14, 28, 29). Based on these results and on commonly occurring amino acids at
these positions (30, 31), we simultaneously randomized all eight e- and g-positions with
equimolar mixtures of Gln, Glu, Lys, Arg, also using trinucleotide codons in DNA
synthesis. Including the Asn-Val combination at the core a-position, each library had a
theoretical diversity of 1.3 x 10^5.
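The stated diversity can be checked with a short calculation. The sketch below assumes only what the paragraph says: four residues at each of eight e/g-positions and two residues at the single core a-position. The names are illustrative and not part of the original protocol.

```python
# Residue sets taken from the library design described in the text.
EG_RESIDUES = ["Gln", "Glu", "Lys", "Arg"]  # allowed at each e/g-position
CORE_A_RESIDUES = ["Asn", "Val"]            # allowed at the central core a-position
N_EG_POSITIONS = 8                          # randomized e/g-positions per helix


def theoretical_diversity(n_eg_positions=N_EG_POSITIONS):
    """Number of distinct peptide sequences encoded by one library."""
    return len(EG_RESIDUES) ** n_eg_positions * len(CORE_A_RESIDUES)


per_library = theoretical_diversity()
print(per_library)  # 4**8 * 2 = 131072, i.e. about 1.3 x 10^5
```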

To increase the stability of the helices, we introduced helix capping residues in both
libraries to saturate the missing hydrogen bonds at the helix ends with their side chains.
Based on studies of helix-capping propensities in proteins (32, 33) and peptides (34,
35), we chose T-X-X-Q (Ncap-N1-N2-N3) for libraryA, and S-X-X-E for libraryB. The
C-cap has only a minor effect on helix stability (35). As Gly has a high preference for the
C-cap position (32, 33), we added a C-terminal Gly. This may contribute to helix
termination or extend the linker which connects the coiled coil to the DHFR fragments
(Fig. 1B).

1.4 Library synthesis

Trinucleotide codons (27) were used to code for randomized positions; all other
positions were made with mononucleotides.

LibraryA: TACTGTGGCGCAACTGNNNGAANNNGTGAAAACCCTTNNNGCTNNNXXX-
TATGAACTTNNNTCTNNNGTGCAGCGCTTGNNNGAGNNNGTTGCCCAGCTTGCTA
(encoding VAQLXEXVKTLXAXZYELXSXVQRLXEXVAQL, wherein X represents a
mixture of E, K, Q, and R, and wherein Z represents a mixture of N and V);

LibraryB: CTCCGTTGACGAACTGNNNGCTNNNGTTGACCAGCTGNNNGACNNNXXX-

TACGCTCTGNNNACCNNNGTr3GCAGCTGNNNAAANNNGTGGAAAAGCT<3TGATA 

(encoding VDELXAXVDQLXDXZYALXTXVAQLXKXVEKL, wherein X represents a
mixture of E, K, Q, and R, and wherein Z represents a mixture of N and V)
(NNN = equimolar mixture of the trinucleotides AAG, CAG, GAG, CGT; XXX = equimolar
mixture of the trinucleotides AAT, GTT).
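As an illustrative check of the listing above, the following sketch translates the two trinucleotide mixtures and the libraryA template with the standard genetic code. The libraryA sequence used here is a reading of the listing in which the codon for the Gln in "VQRL" is taken to be CAG; that reading, the function name and the placeholder handling are assumptions made for this example.

```python
from itertools import product

# Standard genetic code in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

NNN_MIX = ["AAG", "CAG", "GAG", "CGT"]  # trinucleotides used at e/g-positions
XXX_MIX = ["AAT", "GTT"]                # trinucleotides used at the core a-position

# LibraryA template as read from the listing (assumed reconstruction).
LIB_A = (
    "TACTGTGGCGCAACTG"
    "NNNGAANNNGTGAAAACCCTTNNNGCTNNNXXX"
    "TATGAACTTNNNTCTNNNGTGCAGCGCTTGNNNGAGNNNGTTGCCCAGCTTGCTA"
)


def translate_template(template, offset):
    """Translate a library template; NNN becomes X and XXX becomes Z."""
    peptide = []
    for i in range(offset, len(template) - 2, 3):
        codon = template[i:i + 3]
        if codon == "NNN":
            peptide.append("X")
        elif codon == "XXX":
            peptide.append("Z")
        else:
            peptide.append(CODON_TABLE[codon])
    return "".join(peptide)


x_residues = sorted(CODON_TABLE[c] for c in NNN_MIX)  # residues behind X
z_residues = sorted(CODON_TABLE[c] for c in XXX_MIX)  # residues behind Z
peptide_a = translate_template(LIB_A, offset=4)        # skip the leading TACT
print(x_residues, z_residues, peptide_a)
```

The translation runs one codon past the stated peptide because the template ends in GCT (Ala), the start of the NheI site introduced by the reverse primer.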

Generation of the second strand and introduction of SalI and NheI restriction sites were
achieved by PCR using the primers prA-fwd: GGAGTACTGGCATGCAGTCGACTACTGTGGCGCAACTG
and prA-rev: GGACTAGTACCTTCGCTAGCAAGCTGGGCAAC, or
prB-fwd: GGAGTACTGGCATGCAGTCGACCTCCGTTGACGAACTG and
prB-rev: GGACTAGTGCTAGCTTCTGACAGCTTTTCCAC, respectively. This resulted in a 142
bp double-stranded oligonucleotide for either library.
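The 142 bp product size can be reproduced from the primer overlaps. The sketch below does this for libraryA only, using the template and primer sequences as read from the listing (an assumed reconstruction of the printed sequences); the helper names are hypothetical.

```python
# LibraryA template and primers as read from the listing (assumed reconstruction).
LIB_A = (
    "TACTGTGGCGCAACTG"
    "NNNGAANNNGTGAAAACCCTTNNNGCTNNNXXX"
    "TATGAACTTNNNTCTNNNGTGCAGCGCTTGNNNGAGNNNGTTGCCCAGCTTGCTA"
)
PR_A_FWD = "GGAGTACTGGCATGCAGTCGACTACTGTGGCGCAACTG"
PR_A_REV = "GGACTAGTACCTTCGCTAGCAAGCTGGGCAAC"


def revcomp(seq):
    """Reverse complement; the placeholders N and X are left unchanged."""
    return seq.translate(str.maketrans("ACGTNX", "TGCANX"))[::-1]


def overlap(left, right):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def product_length(template, fwd, rev):
    """Length of the PCR product when both primers extend the template ends."""
    rev_rc = revcomp(rev)
    five_prime_extension = len(fwd) - overlap(fwd, template)
    three_prime_extension = len(rev_rc) - overlap(template, rev_rc)
    return five_prime_extension + len(template) + three_prime_extension


length_a = product_length(LIB_A, PR_A_FWD, PR_A_REV)
print(length_a)  # 22 + 104 + 16 = 142
```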

1.5 Expression plasmids

LibraryA and B were both digested with SalI and NheI, gel purified and ligated to the
appropriate vector (Fig. 2), yielding the plasmids LibA-DHFR[1], LibB-DHFR[2] and
LibB-DHFR[2:I114A] (Fig. 2A). After subcloning, the resulting linker between either library
and DHFR fragment was: A(SGTS)2STSSGI for LibA and SEA(SGTS)2STS for LibB. To
achieve maximal library representation, the ligation mixes were individually
electroporated into XL1-Blue cells and selected with ampicillin on rich medium (LB). A 2-
to 7-fold over-representation of each library was obtained. The resulting colonies were
pooled and the plasmid DNA purified such that supercoiled plasmid DNA was obtained
for cotransformation. The supercoiled DNA was cotransformed in BL21 cells, yielding
about 4 x 10^6 double transformants. We used BL21 cells with a transformation efficiency
of no less than 5 x 10^7 transformants per µg of DNA using 200 pg of DNA, or 2 x 10^7
transformants per µg using 500 ng of DNA. In cotransformations, the occurrence of
double transformation was calculated as the number of colonies growing under selective
pressure with trimethoprim (described below) divided by the number growing in its
absence, when cotransformed with equal amounts of each DNA of a given, pre-selected
pair. In order to verify that the library populations encode the designed amino acids with
the expected frequency, single clones from each library were randomly picked and
sequenced before selection. No statistically significant biases were detected. Seventy to
80% of each library had no mutations or frame-shifts, and thus the library-vs-library
combination yielded approximately 50% correct sequence combinations. Thus, the
experimental library-vs-library size of correct pairs is estimated as 2 x 10^6.
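The estimate of correct pairs follows from multiplying the fractions of error-free clones in the two libraries. The sketch below is a rough illustration that assumes the two libraries are independent; the variable names are illustrative.

```python
def correct_pair_fraction(fraction_a, fraction_b):
    """Fraction of cotransformants in which both library members are error-free."""
    return fraction_a * fraction_b


# 70 to 80% of each library had no mutations or frame-shifts.
low = correct_pair_fraction(0.70, 0.70)   # about 0.49
high = correct_pair_fraction(0.80, 0.80)  # about 0.64

double_transformants = 4e6                # reported number of double transformants
approx_correct_fraction = 0.5             # the text rounds the range to about 50%
estimated_correct_pairs = approx_correct_fraction * double_transformants

print(round(low, 2), round(high, 2), estimated_correct_pairs)
```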

1.6 Selection procedure

Three selection strategies were tested here, each having a different level of stringency.
In the lowest stringency selection, we screened two expressed libraries against each
other in a single-step selection (Fig. 2B), where cells cotransformed with complementary
libraries were directly plated on selective medium plates (M9 medium with 1 µg/ml
trimethoprim), and resulting colonies were analyzed, thereby identifying all interacting
polypeptide partners. In the second strategy, we increased the selection stringency by
using a mutant DHFR fragment (Ile114Ala) containing the destabilizing I114A mutation
in DHFR[2], which occurs at the interface between both DHFR fragments. This mutation
prevents stable reassembly of DHFR from its fragments (19) and should thus require
more efficiently heterodimerizing, as opposed to homodimerizing, interacting partners to
drive enzyme reconstitution. Finally, we introduced competitive metabolic selection,
where clones obtained with the second strategy were pooled and passaged through
several rounds of competition selection, in order to enrich for the optimally
heterodimerizing partners. Thereby, the most stable heterodimers should have higher
mDHFR activity and thus a growth advantage.

Selective pressure for DHFR was maintained throughout all steps by inhibiting the
bacterial DHFR with trimethoprim (1 µg/ml) in minimal medium. Ampicillin and
kanamycin (100 µg/ml and 50 µg/ml, respectively) were also included in all steps to
retain the library plasmids and the lacIq repressor-encoding plasmid (pRep4),
respectively. Expression of the proteins was induced with 1 mM IPTG. When selecting
on solid medium, growth was allowed for 45 hrs at 37°C.

When it was necessary to control precisely the starting number of cells in a competition,
the number of viable cells in the starter cultures was quantitated as follows. The
appropriate clones were propagated in liquid media under selective conditions and dilute
aliquots were frozen at -80°C with 15% glycerol. One aliquot for each clone was thawed
and plated under selective conditions, and the colonies counted after 45 hrs. The
volume of cells to use for P0 was then calculated, such that each clone should be
over-represented by a factor of at least 2000. Colony sizes (in Fig. 3) were evaluated using
the NIH Image Particle Analysis Facility. When selecting in liquid medium, the starting
O.D. (600 nm) was either 0.0005 or 0.0001. Cells were propagated either in Erlenmeyer
flasks or in a 10 liter New Brunswick fermentor, depending on the volume required to
ensure adequate representation of all clones present, at 37°C with shaking, or stirring at
250 RPM. After 10 to 24 hrs, O.D. (600 nm) reached 0.2 to 1.0 and cells were
harvested.

1.7 Single-step selection 

As a first step in selection of heterodimerizing leucine zippers, a single-step selection
was undertaken, using the wild-type mDHFR fragments, by cotransforming the libraries
LibA-DHFR[1] and LibB-DHFR[2] and plating on selective media (Fig. 2B). This strategy
applies only a low stringency of selection to the potential pairs, thus many library
combinations were expected to be selected. Approximately 1.7% of the resulting
ampicillin-resistant cells were doubly transformed, harboring (at least) one plasmid from
each library when using 5 ng of each DNA, or 8% were doubly transformed when using
20 ng of each DNA, as seen from control transformations (calculated as described
above). Of the doubly transformed cells which harbor no mutations or frame-shifts,
approximately 35% formed colonies under selective conditions (Table 1). This result
immediately demonstrates that even with relatively low stringency of selection, only a
fraction of the possible combinations of the two libraries allows zipper heterodimerization
leading to efficient mDHFR reassembly. Fourteen colonies resulting from two
independent cotransformations were picked and the sequences encoding the zippers
were determined. Even under these low stringency conditions there exist important
sequence biases in these sequences relative to the unselected ones (Fig. 4B). A
reduction in same-charged e/g-pairs from 31.3% (unselected) to 19% (selected) and an
increase in opposite-charged pairs from 25% (unselected) to 31% (selected) were seen.
As well, a strong enrichment of N-N pairing at the core a-position (25% unselected vs
57% selected) was observed. The characteristics that have been enriched are
consistent with the selection of stable leucine zipper heterodimers.
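The unselected percentages quoted above are what an equimolar mixture predicts. The following sketch enumerates all e/g-pair combinations of E, K, Q and R and classifies them by charge; treating Lys and Arg as positive, Glu as negative and Gln as neutral is the only assumption.

```python
from fractions import Fraction
from itertools import product

# Formal side-chain charges assumed for the four residues used at e/g-positions.
CHARGE = {"E": -1, "K": +1, "R": +1, "Q": 0}


def classify(res1, res2):
    """Classify an e/g-pair by the sign of the product of the two charges."""
    sign = CHARGE[res1] * CHARGE[res2]
    if sign > 0:
        return "same-charged"
    if sign < 0:
        return "opposite-charged"
    return "neutral"


def random_expectation():
    """Expected frequency of each pair class for equimolar, independent residues."""
    pairs = list(product(CHARGE, repeat=2))
    counts = {"same-charged": 0, "opposite-charged": 0, "neutral": 0}
    for res1, res2 in pairs:
        counts[classify(res1, res2)] += 1
    return {name: Fraction(n, len(pairs)) for name, n in counts.items()}


expected = random_expectation()
asn_asn = Fraction(1, 2) ** 2  # Asn and Val equally likely in each helix
print({k: float(v) for k, v in expected.items()}, float(asn_asn))
```

This gives 5/16 (31.25%) same-charged and 4/16 (25%) opposite-charged pairs, and 25% Asn-Asn pairing at the core a-position.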

1.8 Increased stringency: use of the mDHFR Ile114Ala mutation

We repeated the single-step selection, using the Ile114Ala mutant of mDHFR (18, 19), in
order to increase the stringency of selection. We reasoned that only library partners that
form the most stable heterodimers can compensate for the reduced ability of the
mDHFR(Ile114Ala) fragments to fold into active enzyme, resulting in higher enzyme
activity and growth rates. When bacteria were cotransformed with LibA-DHFR[1] and
LibB-DHFR[2:I114A], we observed a 50-fold decrease in the number of colonies upon
selective plating compared to the wild-type DHFR fragments (Table 1). Twenty-five
colonies were picked from 3 independent cotransformations and the DNA sequences
were analyzed. The increase in selectivity was concomitant with an extremely strong
selection for N-N pairing at the core a-position (92%; Fig. 4B), illustrating that the
specificity of in-register parallel alignment provided by N-N pairing is more highly favored
under these in-vivo selection conditions than the higher stability afforded by V-V pairing.
Reassembly of mDHFR from its fragments requires that in the final structure, the two
fragment N-termini be brought close enough together to allow native-like refolding of
DHFR (see Fig. 1) (19, 36). The peptide linkers that connect the library sequences to the
DHFR fragments must be sufficiently flexible to allow DHFR to fold from its fragments,
but not so long that any C-terminal to N-terminal orientation of the final folded leucine
zipper would be allowed. As a result of this structural requirement, parallel in-register
heterodimerization of the library peptides is the only configuration possible. Other biases
in these sequences were also more pronounced than with the wt DHFR fragments (Fig.
4B). In particular, an additional increase in opposite-charged e/g-pairs from 31% to 37%
was seen. In one case, a point mutation resulted in a single clone (1/25) with a V-T pair
at the core a-position.

1.9 Competition selection: Efficiency of selection 

To further increase the selection pressure we applied the principle of competition
selection. We reasoned that, among selected zipper pairs, those which result in more
stable heterodimerization will allow the most efficient enzyme reconstitution leading to
higher DHFR activity. If DHFR activity is limiting for growth, the higher activity should
result in more rapid bacterial propagation, hence these cells would become enriched in
a pool. Thereby, after sequential rounds of growth-competition, subtle differences in
growth rate can be amplified, increasing the stringency of selection relative to the
single-step selection.

To determine the rate at which competition can enrich for particular partner pairs, we
first set up a model competition with a limited number of clones as described in Figure
1C. The initial cell mixture (P0) contained known amounts of viable cells expressing
either GCN4-DHFR[1]/GCN4-DHFR[2:I114A] or one of seven LibA-DHFR[1]/LibB-
DHFR[2:I114A] pairs previously obtained in a single-step selection of those libraries,
mixed at a ratio of 2.9 x 10^4 : 1 (GCN4 : library clones). Productive association of the
homodimeric GCN4 pair should occur only 50% of the time versus up to 100% for
heterodimerizing library clones, so GCN4 is disadvantaged. Within 3 passages, the library
pairs were already visibly enriched (Fig. 5), and after 5 passages the measured ratio
between a restriction fragment indicative of the library and a constant fragment from the
repressor plasmid had reached its maximum, showing that enrichment was maximal.
Colonies resulting from passage 9 (P9) were sequenced. No GCN4 leucine zippers were
present among 24 sequences analyzed. Therefore, enrichment of the library pairs over
GCN4 by a factor of at least 24 x 2.9 x 10^4 = 7 x 10^5 was achieved. Four out of the 7
library clones initially present survived until P9, with varying distributions (data not
shown). The experiment was also repeated at a lower starting ratio of GCN4 and the
same library clones were enriched, consistent with their enrichment being truly the
result of selection (and not of unrepresentative sampling). This indicated that selection
among the pre-selected clones was not as rapid as that seen between pre-selected and
GCN4 zippers, but that the smaller differences between the pre-selected ones can still
be amplified in selection. These results demonstrate that there is a direct link between
reconstitution of mDHFR and growth rate.
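The lower bound on enrichment quoted above can be written out explicitly. The sketch below assumes, as the text does, that finding no GCN4 clone among the sequenced colonies means the library clones outnumber GCN4 by at least the number of colonies sequenced; the function name is illustrative.

```python
def minimum_enrichment(initial_competitor_per_library, n_clones_sequenced):
    """Lower bound on the enrichment of library clones over the competitor.

    The ratio library:competitor starts at 1:initial_competitor_per_library and,
    with zero competitor clones among n sequenced colonies, ends at n:1 or better.
    """
    return n_clones_sequenced * initial_competitor_per_library


start_ratio = 2.9e4   # GCN4 cells per library cell in P0
n_sequenced = 24      # colonies sequenced from P9, none of them GCN4
bound = minimum_enrichment(start_ratio, n_sequenced)
print(f"{bound:.1e}")  # 7.0e+05
```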

1.10 Competition selection for optimal pairs

Our ultimate goal was to select for the "best" among the zipper pairs obtained by
single-step selection. We obtained a large initial number of clones by cotransforming bacteria
with 0.5 µg of DNA each from LibA-DHFR[1] and LibB-DHFR[2:I114A]. Approximately
50% of cells were at least doubly transformed (52% ± 10%, average of 2 independent
control experiments, calculated as described in the Experimental Protocol). We obtained
approximately 1.42 x 10^4 clones on selective medium, which arise from a 1.4 x 10^2-fold
selection factor (see Table 1), and were thus selected from (1.42 x 10^4) x (1.4 x 10^2) =
2.0 x 10^6 library-vs-library cotransformants. These were pooled and passaged. There
was a clear increase in colony sizes with subsequent passages, indicating that
faster-growing clones were taking over (Fig. 3A, B). At P12, the colonies are homogeneously
large, showing similar growth rates among the clones. Twenty-two individual colonies
from P12 were picked and sequenced, as well as 11 from P10 and 2 from each previous
second passage. A single pair (WinZip-A1B1, composed of WinZip-A1-DHFR[1] and
WinZip-B1-DHFR[2:I114A]) was identified 18/22 times (82%) in P12, 4/11 (36%) in P10,
but not in previous passages (Fig. 4C). While other sequences were found in early and
late passages, none was as enriched as WinZip-A1B1. In order to verify that the growth
rate recorded after competition (P12) was independent of bacteria-specific factors
resulting from passaging, we cotransformed DNA from a pure clone of WinZip-A1B1 into
fresh bacteria. The colony size distribution is similar for P12 and for the transformants
(Fig. 3B), illustrating that the growth rate is a direct product of mDHFR reconstitution
directed by the WinZip-A1B1 pair. The sequence bias observed at the core a-position
was yet stronger here: only N-N pairing was recorded at the core a-position. When the
biases at the e/g-positions were calculated according to the occurrence of each
sequence (n=37), there was no significant change in opposite-charged pairing (37%),
while a small increase in same-charged pairing was observed (from 23% to 26%) as a
result of the two same-charged pairs which occur in the predominant WinZip-A1B1 (Fig.
4B, C). However, when each unique sequence was considered only once (n=10) a
further increase of opposite-charged e/g-pairing was observed.

Example 2: Analysis of clones resulting from library screening 

2.1 Introduction 

Clones resulting from the three selection strategies with increasing stringencies were
analyzed and compared: (i) lowest stringency: 14 clones analyzed, (ii) medium
stringency: 25 clones analyzed (see Table 2), (iii) highest stringency: 41 clones analyzed
from various passages. The last passage (P12) yielded a population dominated by a
single pair of coiled-coil sequences, WinZip-A1B1, as described above (Fig. 1C, clone
#1 in Fig. 1D), which was biophysically analyzed (see below). The sequences of clones
surviving at least up to passage 10 are reported in Figure 1D. By comparing the
selected clones from the three strategies, we analyzed the preferences for the core
a-position, the distribution of e/g-pair combinations, and the presence of any bias for
certain amino acids within the individual helices.

2.2 Selection in the core a-position

Sequencing of 16 clones prior to selection showed an equal distribution of Asn and Val
at the core a-position. After selection a strong bias was found toward Asn-pairs, which
became stronger with increasing selection stringency. No bias was seen between
Val-Val or Val-Asn combinations. An Asn-pair was found in 57% (n=14) of the lowest
stringency, in 92% (n=25) of the medium and in 100% (n=41) of the highest stringency
selection. Thus, the specificity gained by this polar interaction clearly outweighs the
higher energetic stability of Val-pairs. This seems to be a very important feature, since a
strong selection occurred even at the lowest stringency.

2.3 Selected e/g-pairs in hetero- and putative homodimers

All selected clones must form a heterodimer to reconstitute DHFR activity; however, the
heterospecificity, and thus the ratio of heterodimers (libraryA/libraryB) to homodimers
(libraryA/libraryA and libraryB/libraryB), can vary, and a mixture of both may still generate
sufficient amounts of active enzyme to allow cell propagation under low stringency
conditions. We analyzed the average occurrence of all e/g-pair combinations in the
heterodimers and the putative homodimers arising from the various selections in relation
to the random statistical distribution (Fig. 6A). The selected heterodimers show on
average more attractive and fewer repulsive interactions than expected in a random
population, indicating selection for stability. This trend, although increasing with higher
stringency, is already observed in the lowest stringency selection, indicating that a
certain threshold of stability is needed to induce enzyme dimerization. Selection for
heterospecificity is achieved by a higher stability of the heterodimer relative to the
homodimers and is only observed in the medium and highest stringency selections
using the destabilizing I114A DHFR[2] mutant (compare Fig. 6A, (i) vs (ii) and (iii)).
Interestingly, this effect is more pronounced for libraryA homodimers than for libraryB
homodimers, and the biophysical characterization of WinZip-A1B1 indicated that the
libraryA homodimer is more stable (see below), and thus might have a stronger
influence on titrating out fragments than the libraryB homodimer.

To determine the degree of heterospecificity achieved in the various selections, the
relative stability of heterodimer versus homodimer was estimated for each single clone.
We calculated the difference of attractive or repulsive interactions, respectively, between
the heterodimer and the average of the two corresponding homodimers and displayed a
histogram of pairs as a function of this difference (Fig. 6B). The results clearly show that
heterospecificity is achieved not only by a decrease of repulsive interactions but also by
an increase of attractive interactions in the heterodimer relative to the homodimers.
However, the lowest and medium stringency selections still yielded a certain fraction of
pairs with no difference or even a reversed ratio, whereas the highest stringency
selection exclusively yielded pairs with distinct heterospecificity. In addition, in no case
were more than three repulsions found in the heterodimers, although in a random
combination 8% of all pairs should have 4 to 6 repulsions.
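The 8% figure is consistent with a simple binomial estimate. The sketch below assumes six independent interacting e/g-pairs per heterodimer, each repulsive (same-charged) with probability 5/16 as expected for equimolar E, K, Q and R; this model is an assumption for illustration, not a description of how the original value was derived.

```python
from math import comb

P_REPULSIVE = 5 / 16   # same-charged fraction among the 16 possible e/g-pairs
N_PAIRS = 6            # interacting e/g-pairs in one heterodimer


def prob_at_least(k_min, n=N_PAIRS, p=P_REPULSIVE):
    """Binomial probability of k_min or more repulsive pairs among n."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


p_four_to_six = prob_at_least(4)
print(round(p_four_to_six, 3))  # 0.081, i.e. about 8%
```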

2.4 Positional distribution of the selected amino acids

Intrahelical electrostatic interactions can influence stability and may even promote
selection of apparently repulsive e/g-pairs. Interactions with the helix macrodipole, for
example, can modulate stability in coiled coils (37). Indeed, we observed a bias for
negatively charged and neutral amino acids in the N-terminal part and positively charged
amino acids in the C-terminal part (Fig. 7). This positional preference may at least
partially compensate the loss of stability resulting from a repulsive e/g-interaction. In
addition, interactions with adjacent residues on the outside of the helix (b- and
c-positions) may influence the contributions of charges at the e- and g-positions (38). This
may explain why at position e1 in libraryB a negatively charged amino acid is not
favored, contrary to the expected counterbalancing of the helix dipole, since this position
is adjacent to two aspartates in positions b1 and b2. On a more general note, the
predominantly selected sequence with the residues from c-Jun at the outer positions
(libraryA) bears remarkable similarity to the e- and g-sequences in the naturally-
occurring c-Jun.

2.5 Library complexity 

Although the predominantly selected sequence pair WinZip-A1B1 showed all the desired
properties in vivo as well as in vitro (see below), we were not able to cover all theoretical
library-vs-library combinations in our selection. Nonetheless, we covered all possible
electrostatic interaction combinations (+/+, -/-, +/-, +/n, -/n, n/n; n=neutral) in all six
interacting e/g-positions, when grouping the core a-position into favored (Asn-Asn) or
disfavored (Asn-Val, Val-Val) combinations. This reduces the theoretical library size from
1.7 x 10^10 to 9 x 10^4 possibilities, which was well covered by the experimental library of
2 x 10^6. It is therefore likely that WinZip-A1B1 contains the most important features for
stability and heterospecificity. Furthermore, the random probability of finding pairs with
no repulsive interactions was 1:40, and with solely attractive interactions was 1:1.6 x 10^4.
Thus, our selection covered a representative sequence space and the same-charged
interactions in WinZip-A1B1 are not a result of incomplete library sampling but must
have more subtle reasons, including in-vivo factors, which we cannot fully address.
Furthermore, in the medium as well as in the highest stringency selection 13 out of 38
pairs sequenced had no repulsive e/g-pairs, but none competed successfully against
WinZip-A1B1 in the selection.
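The reduced library size of about 9 x 10^4 matches a count of interaction classes rather than sequences. The sketch below assumes that each of the six interacting e/g-positions is assigned one of the six electrostatic classes listed above and that the core a-position contributes two classes; the coverage figure is a simple ratio and ignores unequal class frequencies.

```python
INTERACTION_CLASSES = ["+/+", "-/-", "+/-", "+/n", "-/n", "n/n"]
CORE_CLASSES = ["favored (Asn-Asn)", "disfavored (Asn-Val, Val-Val)"]
N_INTERACTING_POSITIONS = 6

# Number of distinct combinations of interaction classes.
reduced_size = len(INTERACTION_CLASSES) ** N_INTERACTING_POSITIONS * len(CORE_CLASSES)

experimental_pairs = 2e6                        # correct pairs actually screened
coverage = experimental_pairs / reduced_size    # average sampling per combination

print(reduced_size, round(coverage, 1))  # 93312 21.4
```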

Example 3: Biophysical characterization of WinZip-A1, WinZip-B1 and WinZip-A1B1

3.1 Secondary structure and oligomerization state of the predominant pair
WinZip-A1B1

We investigated the stability and specificity of the predominantly selected peptides
WinZip-A1 and WinZip-B1 alone and in an equimolar mixture (WinZip-A1B1). All
experiments were performed with N-acetylated and C-amidated synthetic peptides.

3.1.1 Peptide synthesis and purification 

The peptides WinZip-A1: Ac-STTVAQLEEKVKTLRAQNYELKSRVQRLREQVAQLAS-NH2
and WinZip-B1: Ac-STSVDELQAEVDQLQDENYALKTKVAQLRKKVEKLSE-NH2
were synthesized (Applied Biosystems 431A) and purified by reversed-phase HPLC.
Electrospray mass spectrometry confirmed purity and identity of the peptides with a
mass deviation of less than 1 Da. Peptide concentrations were determined by tyrosine
absorbance in 6 M GdnHCl (39).

3.1.2 Circular dichroism measurements



All peptides formed stable α-helical coiled coils as demonstrated by CD spectra.
CD spectra were recorded at 5°C at a total peptide concentration of 150 µM (1 mm
cuvette, Aviv 62DS spectrometer). The standard buffer was 10 mM K2HPO4/KH2PO4, pH
7.0, 100 mM KF; salt concentration and pH were varied as indicated in the respective
experiments. Thermal denaturations were measured at 222 nm in steps of 2.5°C (2 min
equilibration, 30 s averaging). Thermal transitions were >91% reversible except where
indicated. Apparent T_m values were determined by least-squares curve fitting of the
denaturation curves (40), assuming a two-state model. ΔT_m was calculated as
T_m(WinZip-A1B1) - 1/2 [T_m(WinZip-A1) + T_m(WinZip-B1)]. Urea denaturation equilibria were
determined at 20°C by automated titration of native peptide with denatured peptide in 6
M urea (30 µM WinZip-A1 or WinZip-A1B1, respectively, or 60 µM WinZip-B1),
measuring the CD signal at 222 nm (300 s equilibration, 30 s averaging). K_D values were
calculated by linear extrapolation to 0 M denaturant assuming a two-state model
(K_D = [unfolded monomer]^2/[folded dimer]).
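To illustrate what the two-state definition of K_D implies, the sketch below solves the mass-action equation for the unfolded monomer concentration at a given total peptide concentration. It assumes the definition given above together with mass balance, total = [unfolded monomer] + 2 [folded dimer]; the function names are hypothetical, and the example values (30 µM peptide, K_D of 24 nM) are taken from elsewhere in this example.

```python
from math import sqrt


def unfolded_monomer(total_peptide, k_d):
    """[U] in M from K_D = [U]^2/[F2] and total_peptide = [U] + 2[F2].

    Substituting [F2] = [U]^2/K_D gives 2[U]^2 + K_D[U] - K_D*total = 0,
    whose positive root is returned.
    """
    return k_d * (sqrt(1.0 + 8.0 * total_peptide / k_d) - 1.0) / 4.0


def fraction_folded(total_peptide, k_d):
    """Fraction of peptide chains present in the folded dimer."""
    return 1.0 - unfolded_monomer(total_peptide, k_d) / total_peptide


total = 30e-6   # M, total peptide concentration (monomer units)
k_d = 24e-9     # M, approximate K_D reported for WinZip-A1B1
u = unfolded_monomer(total, k_d)
f2 = (total - u) / 2.0
print(round(fraction_folded(total, k_d), 3))  # about 0.98
```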

The helical content was in the range of 90% (WinZip-B1) to 100% (WinZip-A1 and
WinZip-A1B1). Peptide WinZip-A1 as well as the mixture WinZip-A1B1 (Fig. 4A) were
dimeric at 10°C and 25°C over a concentration range from 10 µM to 150 µM as
determined by equilibrium sedimentation. WinZip-B1 was partially unfolded as seen both
by CD (Fig. 8C at 0 M urea) and equilibrium sedimentation.

3.1.3 Equilibrium sedimentation. 

Equilibrium sedimentation experiments were performed using a Beckman XL-A
Ultracentrifuge. Absorbance was monitored at 220 and 275 nm at peptide
concentrations of 10, 50 and 150 µM in 10 mM K2HPO4/KH2PO4, pH 7.0, 100 mM KCl.
Partial specific volumes and solvent densities were determined as described (41). The
data sets were fitted to single molecular masses of monomer, dimer and trimer.
Equilibrium sedimentation indicated a mixture of monomers and dimers, with decreasing
amount of dimer at increasing temperature.

3.1.4 Structural stability and heterospecificity

Thermal denaturation studies at neutral pH (Fig. 9A) revealed apparent T_m values of
28°C (WinZip-B1), 49°C (WinZip-A1), and 55°C for the equimolar mixture of both
(WinZip-A1B1). The large difference between the denaturation curve of the heterodimer
and the average of the curves from WinZip-A1 and WinZip-B1 indicates that
heterodimers form preferentially at equilibrium. This high heterospecificity is best
reflected in a large and positive ΔT_m value, and indeed, we observed a ΔT_m of 16.5°C.
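The reported value follows directly from the definition of ΔT_m given in the circular dichroism section (T_m of the mixture minus the mean T_m of the two homodimers). A minimal sketch, with an illustrative function name:

```python
def delta_tm(tm_hetero, tm_homo_a, tm_homo_b):
    """Heterospecificity measure: T_m of the mixture minus the mean homodimer T_m."""
    return tm_hetero - 0.5 * (tm_homo_a + tm_homo_b)


# Apparent T_m values at neutral pH, in degrees Celsius.
value = delta_tm(tm_hetero=55.0, tm_homo_a=49.0, tm_homo_b=28.0)
print(value)  # 16.5
```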

To probe the mechanism of specificity, the effects of pH (Fig. 9B) and ionic strength (Fig.
9C) were investigated. All peptides were more stable at high pH, most likely because all
have at least one e/g-pair with two positive charges which are neutralized at high pH.
The increased stability of WinZip-B1 at low pH could be due to the shielding of
electrostatic repulsions resulting from its high concentration of acidic residues. However,
the ΔT_m is positive over the whole pH range, indicating heterospecificity. The maximum
degree of heterospecificity was observed at neutral to slightly basic pH, consistent with
the intracellular pH of E. coli (42). High salt concentrations increased the absolute T_m
values (Fig. 9C), presumably due to increased hydrophobic interaction or reduced
electrostatic repulsion. However, the ΔT_m is reduced compared to low salt
concentrations (0-100 mM), most likely due to the decreased influence of ionic
interactions at higher ionic strength.

Interestingly, the overall stability did not correlate directly with the number of potentially
repulsive e/g-interactions. The homodimer WinZip-B1 has two same-charged e/g-pairs,
but is significantly less stable than the homodimer WinZip-A1 with four same-charged
pairs (Fig. 9). Since the overall helical propensity is comparable for both peptides
according to (43), the difference is probably due to intrahelical interactions. LibraryB
might be destabilized by its high local concentration of acidic residues at the N-terminus.
This may also explain why libraryA homodimers have generally more repulsive and less
attractive e/g-interactions than libraryB homodimers, since the e/g-positions play a
bigger role in the destabilization of the intrinsically more stable libraryA in order to
reduce homodimerization.

3.1.5 Native gel electrophoresis

Heterospecificity also was observed by native gel electrophoresis (Fig. 8B).
Gels (7.5% polyacrylamide (19:1), 375 mM β-alanine acetate buffer, pH 4.5) were run
with 500 mM β-alanine acetate buffer, pH 4.5. Samples (~10 µg peptides per lane) were
two-fold diluted with 600 mM β-alanine acetate, pH 4.5, 0.2% methylgreen, 10%
glycerol. Gels were prerun at 100 V for at least 45 min, run for 2-3 h at 5°C, and fixed
with 2% glutaraldehyde or 20% (w/v) TCA, respectively, before staining with Coomassie
blue.

To obtain a significant migration, an acidic buffer (pH 4.5) had to be used, and thus
conditions where the heterospecificity is the lowest (ΔT_m of only 7°C, compared to
16.5°C at neutral pH, Fig. 9C). Nevertheless, even under these stringent conditions
heterodimers were obtained almost exclusively from the equimolar mixture (Fig. 8B),
suggesting very high heterospecificity at neutral pH, and thus indicating how strongly
heterospecificity was selected for.

3.1.6 K_D determination

Dissociation constants of the peptides were derived from equilibrium urea denaturations
(Fig. 8C). The heterodimer WinZip-A1B1 was the most stable species with a K_D of
approximately 24 nM, while the homodimer WinZip-A1 had a K_D of approximately 63 nM.
The accuracy of the K_D determination of WinZip-B1 is lower since it is already partially
unfolded without denaturant (see above). The K_D was estimated to be in the 10^-5 M
range. Calculations were confirmed by determining the K_D values from thermal
denaturation curves by a van't Hoff analysis, assuming as a first approximation a
constant ΔH (40). We found reasonable agreement with the data obtained by urea
denaturation, with a maximal deviation of K_D by a factor of 2.6.

3.1.7 Comparison to other coiled coils

Designed coiled coils are usually only judged for being stable in vitro and, in certain
cases, for heterospecificity, whereas naturally-occurring coiled coils must also function
reliably in a cellular environment. Similar demands are imposed on our selection and on
further in-vivo applications, and therefore WinZip-A1B1 is best compared with other
naturally-occurring coiled coils. The homodimeric coiled coil of the yeast transcription
factor GCN4 has an equal or slightly higher T_m, depending on the length and
concentration of the peptides chosen (44, 45). The N-terminal homodimeric coiled coil of
the APC protein has a T_m lower by at least 9°C than WinZip-A1B1 (46). The
heterodimeric coiled coil from c-Jun/c-Fos shows comparable T_m and ΔT_m values (2).
However, those data were derived from disulfide-bridged peptides. The coiled coil from
c-Myc/Max also heterodimerizes to a fairly high extent, but peptides of comparable
length have a T_m of only 31°C and a K_D of 60 µM (25°C) (45), whereas our WinZip-A1B1
has a T_m of 55°C and a K_D of approximately 24 nM (20°C). Thus, WinZip-A1B1
compares favorably with naturally-occurring coiled coils and will therefore be very
useful for a variety of in-vivo applications.

Example 4: Chain shuffling of the WinZip-A1B1 sequences 
3.1 Introduction 

In the above experiment, WinZip-A1B1 was selected from a sample representing 2.0 x 
10^6 library-vs-library cotransformants. As the theoretical library-vs-library diversity is 
(1.31 x 10^5)^2 = 1.72 x 10^10, approximately 0.01% of the library-vs-library space was 
sampled. However, we obtained a very high coverage of either single library (theoretical 
complexity of 1.31 x 10^5), where the probability of all members being present at least 
once is P=0.973. Thus, each polypeptide sampled only a small portion of the opposite 
library (2.0 x 10^6 / 1.31 x 10^5 = 15.4 polypeptides of the other library with P=0.999, 
assuming equal transformation rates for both libraries) and it is likely that better 
combinations for the WinZip-A1B1 peptides may be found. Using WinZip-A1B1 as a 
partially optimized starting point, we combined each of the two WinZip-A1B1 
polypeptides with the opposite library (WinZip-A1-DHFR[1] + LibB-DHFR[2:I114A] and 
WinZip-B1-DHFR[2:I114A] + LibA-DHFR[1]). 

3.2 Chain shuffling 

DNA from the WinZip-A1B1 clone was isolated and retransformed into bacteria in order 
to obtain clones carrying either plasmid WinZip-A1-DHFR[1] or 
WinZip-B1-DHFR[2:I114A]. A pure clone (for each) was electroporated with the appropriate library. 
Library representation was calculated by comparison with control transformations of the 
same cells with DNA from the other WinZip-A1B1 polypeptide (calculated as the number 
of colonies growing in the presence of trimethoprim divided by the number growing in 
the absence). 

Single-step selection yielded pre-selected pools for either competition. In both cases, 
the library (1.3 x 10^5) was over-represented by a factor of 24 and 14, respectively, and 
the probability that all members were present at least once as partners of the "constant" 
peptide is P > 0.999 and 0.882, respectively. With passages of selection competition, a 
clear increase in colony sizes was again observed, indicating that faster-growing clones 
were taking over (Fig. 3C). 
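
The coverage figures quoted in this example can be approximately reproduced with a simple sampling model. The sketch below is an illustration only. The source does not state which formula was used, so the Poisson approximation P = exp(-N exp(-n/N)) for drawing n transformants uniformly from a library of N variants is an assumption, as are the function and variable names.

```python
import math


def prob_all_present(library_size, fold_coverage):
    """Approximate probability that every library member occurs at least once.

    Assumes uniform, independent sampling; fold_coverage = n_sampled / library_size.
    The number of missing members is treated as Poisson distributed.
    """
    expected_missing = library_size * math.exp(-fold_coverage)
    return math.exp(-expected_missing)


LIBRARY_SIZE = 1.31e5      # theoretical complexity of one library (from the text)
COTRANSFORMANTS = 2.0e6    # library-vs-library cotransformants (from the text)

pair_space = LIBRARY_SIZE ** 2                   # all possible A/B pairs
fraction_sampled = COTRANSFORMANTS / pair_space  # share of the pair space sampled

# Fold coverages quoted in the text: 15.4 (original selection), 24 and 14 (chain shuffling).
for fold in (15.4, 24.0, 14.0):
    p = prob_all_present(LIBRARY_SIZE, fold)
    print(f"{fold:>5.1f}-fold coverage: P(all present) = {p:.3f}")

print(f"pair space = {pair_space:.3g}, fraction sampled = {fraction_sampled:.2%}")
```

Under these assumptions the model gives about 0.97 for 15.4-fold coverage and more than 0.999 for 24-fold coverage, in line with the quoted values. For 14-fold coverage it gives about 0.90, slightly above the quoted 0.882, which may reflect rounding of the coverage factor or a different formula.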

3.2 Analysis 

At P0 and each second passage, DNA from the entire pool of cells was sequenced in 
order to follow the rate of evolution of each library against a constant partner. Figure 10 
illustrates the results from representative semi-randomized positions. It is clear that the 
rate of selection is not constant at all positions: some positions showed a dominant 
residue (> 50%) already at P4 and clear selection (> 90%) at P6 (see position e2) while 
others remained mixed (< 50%) until P6 and became clear only at P10 (see position g3). 
This was observed in both selections. The sequences from individual colonies were 
analyzed. In both selections, a predominant clone was identified (Table 1 and Fig. 4C), 
which is similar, but not identical, to the originally selected WinZip-A1B1 pair. The 
selection of the predominant clone WinZip-A2B1 (selection of LibA-DHFR[1] against 
WinZip-B1-DHFR[2:I114A]) was achieved before P10, as P10 (4 clones analyzed) and 
P12 (4 clones analyzed) revealed only this clone. The selection of the predominant 
clone WinZip-A1B2 (selection of LibB-DHFR[2:I114A] against WinZip-A1-DHFR[1]) was 
clear but not complete after 12 passages, as it was identified 4/6 times in P12 and 3/5 
times in P10. 

Analysis of the biophysical properties of the peptide pairs selected in the chain shuffling 
experiments, performed as described in Examples 2 and 3, indicated that the observed 
e/g-interaction pattern is similar to that of WinZip-A1B1. 

Conclusion 

We applied a fast and simple strategy, a library-vs-library selection with the fragment 
complementation assay, to select for a metabolically stable, dimeric and highly 
heterospecific coiled coil with high affinity. Comparison of the outcome of various 
selections performed with different stringencies revealed insight into which properties 
are selected already at lower stringency, and are thus the most crucial for successful 
heterodimerization, and those which only become apparent at higher selection 
stringency, and thus represent a more subtle optimization. The most striking selection 
occurred at the core a-position for Asn-pairs, revealing that structural uniqueness is 
essential for efficient and selective heterodimerization. Furthermore, comparison of 
selected e/g-pairs from hetero- and putative homodimers indicated selection for stability 
even at the lowest stringency, whereas selection for heterospecificity was more 
pronounced at higher stringency. Heterospecificity was achieved not only by decreasing 
the number of repulsive e/g-interactions but also by increasing the number of attractive 
interactions in the heterodimer relative to the homodimers. 

The selection for heterospecificity (and thus against homodimers) may be a unique 
feature of this selection system. Not only is active enzyme exclusively formed by parallel 
heterodimers, but homodimers and higher oligomers are likely to have a negative effect 
by unproductively wasting fragments and perhaps even harmfully accumulating 
non-functional enzyme. Dimer stability, in turn, is dependent not only on e/g-pair interactions, 
but also on helical propensity, intrahelical interactions and helix dipole stabilization. 
Indeed, our analysis revealed that the most successful variants do not simply consist of 
complementary charges in the e/g-positions, but show a more complicated pattern, 
presumably fulfilling a variety of naturally conflicting demands on the sequence, whose 
optimum would have been extremely challenging to predict. 



The biophysical characterization of the predominantly selected pair WinZip-A1B1 
revealed the formation of a stable dimeric coiled coil with very high heterospecificity, 
confirming the results from the sequence analysis and the validity of the selection 
strategy. The results obtained with WinZip-A1B1 and the peptide pairs identified in the 
chain shuffling experiment support the view that idealized sequences, based on 
the single principle of merely relieving repulsive e/g-interactions in the homodimers with 
complementary charges in the heterodimer, may not be optimal for biological 
applications. 



TABLE 1 